Overview

Dataset statistics

Number of variables: 22
Number of observations: 42535
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.1 MiB
Average record size in memory: 741.3 B

Variable types

Numeric: 9
Categorical: 8
Boolean: 5

Alerts

Hours of Work is highly correlated with Salary [High correlation]
Salary is highly correlated with Hours of Work [High correlation]
Care of Children is highly correlated with Family [High correlation]
Family is highly correlated with Care of Children [High correlation]
Owned Entity is highly correlated with Self Employed [High correlation]
English Proficiency is highly correlated with Ethnic [High correlation]
Maternity is highly correlated with Family [High correlation]
Working is highly correlated with Finding Work [High correlation]
Self Employed is highly correlated with Owned Entity [High correlation]
Finding Work is highly correlated with Working [High correlation]
Family is highly correlated with Maternity [High correlation]
Ethnic is highly correlated with English Proficiency [High correlation]
English Proficiency is highly correlated with Ethnic and 1 other field [High correlation]
Maternity is highly correlated with Care of Children [High correlation]
Working is highly correlated with Self Employed and 3 other fields [High correlation]
Self Employed is highly correlated with Working and 2 other fields [High correlation]
Hours of Work is highly correlated with Working and 2 other fields [High correlation]
Finding Work is highly correlated with Working and 3 other fields [High correlation]
Care of Children is highly correlated with Maternity and 1 other field [High correlation]
Education Level is highly correlated with English Proficiency and 1 other field [High correlation]
Salary is highly correlated with Working and 3 other fields [High correlation]
Hours of Work has 17251 (40.6%) zeros [Zeros]
Care of Family has 7457 (17.5%) zeros [Zeros]
Salary has 15618 (36.7%) zeros [Zeros]
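
The zero-count alerts can be cross-checked against the number of observations. The following is a minimal sketch, assuming each percentage is simply the zero count divided by the total and rounded to one decimal place; the names `zero_counts` and `zero_share` are illustrative and not part of the report.

```python
# Observation count and zero counts copied from the report.
N_OBSERVATIONS = 42535
zero_counts = {
    "Hours of Work": 17251,
    "Care of Family": 7457,
    "Salary": 15618,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the percentage of zeros, rounded to one decimal place."""
    return round(100 * count / total, 1)


zero_percentages = {name: zero_share(count) for name, count in zero_counts.items()}

for name, pct in zero_percentages.items():
    print(f"{name}: {pct}% zeros")
```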

Reproduction

Analysis started: 2022-02-06 02:42:03.850673
Analysis finished: 2022-02-06 02:42:25.166163
Duration: 21.32 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
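
The reported duration follows from the two timestamps. A small sketch using only the Python standard library; the format string `FMT` is an assumption about how the timestamps are laid out, inferred from their appearance in the report.

```python
from datetime import datetime

# Timestamp layout inferred from the report (assumption).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-02-06 02:42:03.850673", FMT)
finished = datetime.strptime("2022-02-06 02:42:25.166163", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```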

Variables

Age
Real number (ℝ≥0)

Distinct: 40
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.03676972
Minimum: 7
Maximum: 47
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.3 KiB

Quantile statistics

Minimum: 7
5th percentile: 17
Q1: 22
Median: 25
Q3: 28
95th percentile: 34
Maximum: 47
Range: 40
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.016754953
Coefficient of variation (CV): 0.2003754881
Kurtosis: -0.001978715106
Mean: 25.03676972
Median Absolute Deviation (MAD): 3
Skewness: 0.1949468627
Sum: 1064939
Variance: 25.16783026
Monotonicity: Not monotonic
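
Several of the Age statistics are derived from others in the same summary. The sketch below recomputes them from the reported values, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1); the variable names are illustrative.

```python
# Values copied from the Age summary.
mean = 25.03676972
std = 5.016754953
minimum, maximum = 7, 47
q1, q3 = 22, 28

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV={cv:.10f} variance={variance:.8f} range={value_range} IQR={iqr}")
```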
[Figure: Histogram with fixed size bins (bins=40)]

Common values

Value              Count  Frequency (%)
25                 3375   7.9%
24                 3293   7.7%
23                 3283   7.7%
26                 3266   7.7%
27                 3034   7.1%
22                 2945   6.9%
28                 2733   6.4%
21                 2538   6.0%
29                 2313   5.4%
20                 2212   5.2%
Other values (30)  13543  31.8%

Minimum 10 values

Value  Count  Frequency (%)
7      1      < 0.1%
9      5      < 0.1%
10     14     < 0.1%
11     35     0.1%
12     68     0.2%
13     141    0.3%
14     266    0.6%
15     395    0.9%
16     661    1.6%
17     1028   2.4%

Maximum 10 values

Value  Count  Frequency (%)
47     1      < 0.1%
46     2      < 0.1%
45     3      < 0.1%
44     4      < 0.1%
43     11     < 0.1%
42     27     0.1%
41     33     0.1%
40     61     0.1%
39     106    0.2%
38     181    0.4%

Gender
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB
Female: 21370
Male: 16928
Other: 4237

Length

Max length: 6
Median length: 6
Mean length: 5.104431645
Min length: 4
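
The length statistics describe the string lengths of the category labels, weighted by how often each label occurs. A minimal sketch that recomputes the mean, minimum, and maximum length from the category counts; the dictionary name `category_counts` is illustrative.

```python
# Category counts copied from the Gender summary.
category_counts = {"Female": 21370, "Male": 16928, "Other": 4237}

total = sum(category_counts.values())

# Mean label length, weighted by the number of rows per label.
mean_length = sum(len(label) * n for label, n in category_counts.items()) / total

min_length = min(len(label) for label in category_counts)
max_length = max(len(label) for label in category_counts)

print(f"mean={mean_length:.9f} min={min_length} max={max_length}")
```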

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Male
3rd row: Female
4th row: Female
5th row: Other

Common Values

Value   Count  Frequency (%)
Female  21370  50.2%
Male    16928  39.8%
Other   4237   10.0%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value   Count  Frequency (%)
female  21370  50.2%
male    16928  39.8%
other   4237   10.0%

Location
Real number (ℝ≥0)

Distinct: 373
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2119.255272
Minimum: 2000
Maximum: 2545
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 332.4 KiB

Quantile statistics

Minimum: 2000
5th percentile: 2007
Q1: 2040
Median: 2083
Q3: 2145
95th percentile: 2493
Maximum: 2545
Range: 545
Interquartile range (IQR): 105

Descriptive statistics

Standard deviation: 122.3564468
Coefficient of variation (CV): 0.05773558683
Kurtosis: 3.83442088
Mean: 2119.255272
Median Absolute Deviation (MAD): 50
Skewness: 2.022080736
Sum: 90142523
Variance: 14971.10007
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
2011                306    0.7%
2012                303    0.7%
2013                300    0.7%
2032                284    0.7%
2017                282    0.7%
2000                282    0.7%
2016                282    0.7%
2071                281    0.7%
2025                279    0.7%
2007                279    0.7%
Other values (363)  39657  93.2%

Minimum 10 values

Value  Count  Frequency (%)
2000   282    0.7%
2001   264    0.6%
2002   254    0.6%
2003   261    0.6%
2004   269    0.6%
2005   270    0.6%
2006   272    0.6%
2007   279    0.7%
2008   276    0.6%
2009   251    0.6%

Maximum 10 values

Value  Count  Frequency (%)
2545   1      < 0.1%
2541   2      < 0.1%
2536   2      < 0.1%
2533   3      < 0.1%
2531   5      < 0.1%
2530   4      < 0.1%
2529   6      < 0.1%
2528   7      < 0.1%
2527   3      < 0.1%
2526   12     < 0.1%

Ethnic
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB
No: 19350
White: 7682
Middle East: 3819
Hispanic: 3011
Asian: 2966
Other values (3): 5707

Length

Max length: 11
Median length: 5
Mean length: 4.657646644
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: White
4th row: Asian
5th row: Middle East

Common Values

Value        Count  Frequency (%)
No           19350  45.5%
White        7682   18.1%
Middle East  3819   9.0%
Hispanic     3011   7.1%
Asian        2966   7.0%
Other        2845   6.7%
Aboriginal   1939   4.6%
African      923    2.2%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value       Count  Frequency (%)
no          19350  41.7%
white       7682   16.6%
middle      3819   8.2%
east        3819   8.2%
hispanic    3011   6.5%
asian       2966   6.4%
other       2845   6.1%
aboriginal  1939   4.2%
african     923    2.0%
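
The lowercase table shows different percentages from the Common Values table even though the counts match. One plausible explanation, assumed here rather than stated in the report, is that it counts lowercased, whitespace-separated words, so "Middle East" contributes two words and the denominator grows from 42535 to 46354. A sketch of that calculation:

```python
from collections import Counter

# Category counts copied from the Common Values table for Ethnic.
category_counts = {
    "No": 19350, "White": 7682, "Middle East": 3819, "Hispanic": 3011,
    "Asian": 2966, "Other": 2845, "Aboriginal": 1939, "African": 923,
}

# Assumption: each label is lowercased and split on whitespace,
# and every resulting word is counted once per row.
word_counts = Counter()
for label, n in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
word_share = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}

for word, count in word_counts.most_common():
    print(f"{word}: {count} ({word_share[word]}%)")
```

Under this assumption the shares reproduce the table: "no" drops from 45.5% of rows to 41.7% of words.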


Religion
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB
Christian: 11241
Catholic: 10758
Islam: 7171
Buddhism: 7072
Hinduism: 3432
Other values (2): 2861

Length

Max length: 9
Median length: 8
Mean length: 7.501492888
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Buddhism
2nd row: Catholic
3rd row: Buddhism
4th row: Catholic
5th row: Buddhism

Common Values

Value      Count  Frequency (%)
Christian  11241  26.4%
Catholic   10758  25.3%
Islam      7171   16.9%
Buddhism   7072   16.6%
Hinduism   3432   8.1%
Other      2078   4.9%
No         783    1.8%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value      Count  Frequency (%)
christian  11241  26.4%
catholic   10758  25.3%
islam      7171   16.9%
buddhism   7072   16.6%
hinduism   3432   8.1%
other      2078   4.9%
no         783    1.8%

Citizen
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Value  Count  Frequency (%)
True   34495  81.1%
False  8040   18.9%

English Proficiency
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
0.0: 23348
4.0: 8549
3.0: 4605
1.0: 3660
2.0: 2373

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 4.0
4th row: 0.0
5th row: 3.0

Common Values

Value  Count  Frequency (%)
0.0    23348  54.9%
4.0    8549   20.1%
3.0    4605   10.8%
1.0    3660   8.6%
2.0    2373   5.6%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value  Count  Frequency (%)
0.0    23348  54.9%
4.0    8549   20.1%
3.0    4605   10.8%
1.0    3660   8.6%
2.0    2373   5.6%

Maternity
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB
Single: 25791
Married: 16744

Length

Max length: 7
Median length: 6
Mean length: 6.393652286
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Married
2nd row: Married
3rd row: Married
4th row: Married
5th row: Single

Common Values

Value    Count  Frequency (%)
Single   25791  60.6%
Married  16744  39.4%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value    Count  Frequency (%)
single   25791  60.6%
married  16744  39.4%

Working
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Value  Count  Frequency (%)
True   25284  59.4%
False  17251  40.6%

Self Employed
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Value  Count  Frequency (%)
False  34909  82.1%
True   7626   17.9%

Owned Entity
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB
No: 34909
Unincorporated: 7246
Incorporated: 380

Length

Max length: 14
Median length: 2
Mean length: 4.133584107
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Unincorporated
3rd row: Unincorporated
4th row: No
5th row: Unincorporated

Common Values

Value           Count  Frequency (%)
No              34909  82.1%
Unincorporated  7246   17.0%
Incorporated    380    0.9%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value           Count  Frequency (%)
no              34909  82.1%
unincorporated  7246   17.0%
incorporated    380    0.9%

Hours of Work
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 25285
Distinct (%): 59.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.81556266
Minimum: 0
Maximum: 57.0263113
Zeros: 17251
Zeros (%): 40.6%
Negative: 0
Negative (%): 0.0%
Memory size: 332.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 30.04602315
Q3: 36.0281862
95th percentile: 41.83458562
Maximum: 57.0263113
Range: 57.0263113
Interquartile range (IQR): 36.0281862

Descriptive statistics

Standard deviation: 17.61909881
Coefficient of variation (CV): 0.8464387488
Kurtosis: -1.761455601
Mean: 20.81556266
Median Absolute Deviation (MAD): 9.963197792
Skewness: -0.2422505907
Sum: 885389.9578
Variance: 310.432643
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                 Count  Frequency (%)
0                     17251  40.6%
39.36500508           1      < 0.1%
38.80634581           1      < 0.1%
36.1772221            1      < 0.1%
32.46060496           1      < 0.1%
38.95036779           1      < 0.1%
42.58812308           1      < 0.1%
34.37126103           1      < 0.1%
29.3501262            1      < 0.1%
37.24509283           1      < 0.1%
Other values (25275)  25275  59.4%

Minimum 10 values

Value        Count  Frequency (%)
0            17251  40.6%
10.24621082  1      < 0.1%
15.77241063  1      < 0.1%
16.32217067  1      < 0.1%
16.35021306  1      < 0.1%
16.71171916  1      < 0.1%
17.3304094   1      < 0.1%
17.42625478  1      < 0.1%
17.81178691  1      < 0.1%
18.09425014  1      < 0.1%

Maximum 10 values

Value        Count  Frequency (%)
57.0263113   1      < 0.1%
56.16138369  1      < 0.1%
55.18991882  1      < 0.1%
54.91622389  1      < 0.1%
54.38054297  1      < 0.1%
53.96436293  1      < 0.1%
53.78875744  1      < 0.1%
53.3596931   1      < 0.1%
52.43915497  1      < 0.1%
52.38207826  1      < 0.1%

Finding Work
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Value  Count  Frequency (%)
False  27009  63.5%
True   15526  36.5%

Care of Children
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 39
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.8706477
Minimum: 2
Maximum: 40
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.3 KiB

Quantile statistics

Minimum: 2
5th percentile: 10
Q1: 13
Median: 17
Q3: 33
95th percentile: 37
Maximum: 40
Range: 38
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 10.15207438
Coefficient of variation (CV): 0.4641871846
Kurtosis: -1.610634184
Mean: 21.8706477
Median Absolute Deviation (MAD): 6
Skewness: 0.3351281803
Sum: 930268
Variance: 103.0646143
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=39)]

Common values

Value              Count  Frequency (%)
13                 3321   7.8%
15                 3246   7.6%
14                 3236   7.6%
34                 2894   6.8%
12                 2818   6.6%
35                 2789   6.6%
16                 2635   6.2%
33                 2489   5.9%
36                 2303   5.4%
11                 2129   5.0%
Other values (29)  14675  34.5%

Minimum 10 values

Value  Count  Frequency (%)
2      1      < 0.1%
3      1      < 0.1%
4      10     < 0.1%
5      16     < 0.1%
6      61     0.1%
7      238    0.6%
8      479    1.1%
9      843    2.0%
10     1520   3.6%
11     2129   5.0%

Maximum 10 values

Value  Count  Frequency (%)
40     24     0.1%
39     185    0.4%
38     598    1.4%
37     1377   3.2%
36     2303   5.4%
35     2789   6.6%
34     2894   6.8%
33     2489   5.9%
32     1821   4.3%
31     1121   2.6%

Care of Family
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.512260491
Minimum: 0
Maximum: 6
Zeros: 7457
Zeros (%): 17.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 1
Q3: 2
95th percentile: 3
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.059779884
Coefficient of variation (CV): 0.7007918875
Kurtosis: -0.174466929
Mean: 1.512260491
Median Absolute Deviation (MAD): 1
Skewness: 0.4436011901
Sum: 64324
Variance: 1.123133403
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]

Common values

Value  Count  Frequency (%)
1      14955  35.2%
2      12805  30.1%
0      7457   17.5%
3      5694   13.4%
4      1449   3.4%
5      169    0.4%
6      6      < 0.1%

Minimum values

Value  Count  Frequency (%)
0      7457   17.5%
1      14955  35.2%
2      12805  30.1%
3      5694   13.4%
4      1449   3.4%
5      169    0.4%
6      6      < 0.1%

Maximum values

Value  Count  Frequency (%)
6      6      < 0.1%
5      169    0.4%
4      1449   3.4%
3      5694   13.4%
2      12805  30.1%
1      14955  35.2%
0      7457   17.5%
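
Because Care of Family takes only seven distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below does so, assuming the report uses the sample variance (divisor n - 1, the pandas default); that assumption is consistent with the reported figures but is not stated in the report itself.

```python
import math

# Value -> count, copied from the Care of Family frequency table.
value_counts = {0: 7457, 1: 14955, 2: 12805, 3: 5694, 4: 1449, 5: 169, 6: 6}

n = sum(value_counts.values())
total = sum(v * c for v, c in value_counts.items())
mean = total / n

# Sample variance (divisor n - 1); an assumption about the report's convention.
variance = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / (n - 1)
std = math.sqrt(variance)

print(f"n={n} sum={total} mean={mean:.9f} variance={variance:.9f} std={std:.9f}")
```

Dividing by n instead of n - 1 would give roughly 1.12311, which differs from the reported variance in the fifth decimal place.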

Domestic Activities Hours
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.603197367
Minimum: 0
Maximum: 8
Zeros: 340
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 3
Median: 4
Q3: 5
95th percentile: 6
Maximum: 8
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4085062
Coefficient of variation (CV): 0.3909045374
Kurtosis: -0.2592881319
Mean: 3.603197367
Median Absolute Deviation (MAD): 1
Skewness: 0.07550304718
Sum: 153262
Variance: 1.983889715
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=9)]

Common values

Value  Count  Frequency (%)
4      11033  25.9%
3      10952  25.7%
5      7440   17.5%
2      6683   15.7%
6      2943   6.9%
1      2337   5.5%
7      743    1.7%
0      340    0.8%
8      64     0.2%

Minimum values

Value  Count  Frequency (%)
0      340    0.8%
1      2337   5.5%
2      6683   15.7%
3      10952  25.7%
4      11033  25.9%
5      7440   17.5%
6      2943   6.9%
7      743    1.7%
8      64     0.2%

Maximum values

Value  Count  Frequency (%)
8      64     0.2%
7      743    1.7%
6      2943   6.9%
5      7440   17.5%
4      11033  25.9%
3      10952  25.7%
2      6683   15.7%
1      2337   5.5%
0      340    0.8%

Volunteered Hours
Real number (ℝ≥0)

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.997696015
Minimum: 0
Maximum: 20
Zeros: 10
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 4
Q1: 6
Median: 8
Q3: 10
95th percentile: 12
Maximum: 20
Range: 20
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.541524121
Coefficient of variation (CV): 0.3177820358
Kurtosis: 0.0550381149
Mean: 7.997696015
Median Absolute Deviation (MAD): 2
Skewness: 0.2676954579
Sum: 340182
Variance: 6.459344858
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=21)]

Common values

Value              Count  Frequency (%)
8                  6588   15.5%
7                  6504   15.3%
9                  5769   13.6%
6                  5454   12.8%
10                 4527   10.6%
5                  3616   8.5%
11                 3080   7.2%
4                  2014   4.7%
12                 1882   4.4%
13                 1049   2.5%
Other values (11)  2052   4.8%

Minimum 10 values

Value  Count  Frequency (%)
0      10     < 0.1%
1      56     0.1%
2      275    0.6%
3      849    2.0%
4      2014   4.7%
5      3616   8.5%
6      5454   12.8%
7      6504   15.3%
8      6588   15.5%
9      5769   13.6%

Maximum 10 values

Value  Count  Frequency (%)
20     2      < 0.1%
19     8      < 0.1%
18     8      < 0.1%
17     32     0.1%
16     85     0.2%
15     227    0.5%
14     500    1.2%
13     1049   2.5%
12     1882   4.4%
11     3080   7.2%

Education Level
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.412954038
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 332.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 4
Q3: 7
95th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.417299974
Coefficient of variation (CV): 0.7743792356
Kurtosis: -1.295012157
Mean: 4.412954038
Median Absolute Deviation (MAD): 3
Skewness: 0.4799368925
Sum: 187705
Variance: 11.67793912
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value  Count  Frequency (%)
1      15367  36.1%
10     6566   15.4%
4      2986   7.0%
3      2929   6.9%
2      2867   6.7%
5      2862   6.7%
6      2638   6.2%
7      2359   5.5%
8      2087   4.9%
9      1874   4.4%

Minimum 10 values

Value  Count  Frequency (%)
1      15367  36.1%
2      2867   6.7%
3      2929   6.9%
4      2986   7.0%
5      2862   6.7%
6      2638   6.2%
7      2359   5.5%
8      2087   4.9%
9      1874   4.4%
10     6566   15.4%

Maximum 10 values

Value  Count  Frequency (%)
10     6566   15.4%
9      1874   4.4%
8      2087   4.9%
7      2359   5.5%
6      2638   6.2%
5      2862   6.7%
4      2986   7.0%
3      2929   6.9%
2      2867   6.7%
1      15367  36.1%

Family
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
1: 17650
0: 11064
2: 10775
3: 2799
4: 247

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 3
4th row: 2
5th row: 0

Common Values

Value  Count  Frequency (%)
1      17650  41.5%
0      11064  26.0%
2      10775  25.3%
3      2799   6.6%
4      247    0.6%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value  Count  Frequency (%)
1      17650  41.5%
0      11064  26.0%
2      10775  25.3%
3      2799   6.6%
4      247    0.6%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Value  Count  Frequency (%)
True   42121  99.0%
False  414    1.0%

Dwell Type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB
With mortgage: 19194
Owned outright: 8947
Other: 5367
Rent-free: 5091
Rented: 3936

Length

Max length: 14
Median length: 13
Mean length: 11.07440931
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: With mortgage
2nd row: Owned outright
3rd row: With mortgage
4th row: Rent-free
5th row: Other

Common Values

Value           Count  Frequency (%)
With mortgage   19194  45.1%
Owned outright  8947   21.0%
Other           5367   12.6%
Rent-free       5091   12.0%
Rented          3936   9.3%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value      Count  Frequency (%)
with       19194  27.2%
mortgage   19194  27.2%
owned      8947   12.7%
outright   8947   12.7%
other      5367   7.6%
rent-free  5091   7.2%
rented     3936   5.6%

Salary
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 569
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18381.85729
Minimum: 0
Maximum: 120000
Zeros: 15618
Zeros (%): 36.7%
Negative: 0
Negative (%): 0.0%
Memory size: 332.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 13400
Q3: 34900
95th percentile: 50000
Maximum: 120000
Range: 120000
Interquartile range (IQR): 34900

Descriptive statistics

Standard deviation: 18902.50256
Coefficient of variation (CV): 1.028323866
Kurtosis: -0.3836916549
Mean: 18381.85729
Median Absolute Deviation (MAD): 13400
Skewness: 0.6723427038
Sum: 781872300
Variance: 357304603.1
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                   15618  36.7%
40000               796    1.9%
30000               785    1.8%
50000               649    1.5%
20000               535    1.3%
60000               384    0.9%
10000               322    0.8%
43400               273    0.6%
46100               244    0.6%
45900               238    0.6%
Other values (559)  22691  53.3%

Minimum 10 values

Value  Count  Frequency (%)
0      15618  36.7%
100    8      < 0.1%
200    25     0.1%
300    23     0.1%
400    28     0.1%
500    20     < 0.1%
600    54     0.1%
700    42     0.1%
800    19     < 0.1%
900    31     0.1%

Maximum 10 values

Value   Count  Frequency (%)
120000  1      < 0.1%
110000  7      < 0.1%
100000  28     0.1%
90000   54     0.1%
80000   123    0.3%
70000   214    0.5%
60000   384    0.9%
59100   1      < 0.1%
58700   3      < 0.1%
58100   1      < 0.1%

Interactions

[Figures: interaction plots]
2022-02-06T13:42:22.266558image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
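
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the helper names `average_ranks` and `spearman_rho` are hypothetical, and tied values are assumed to receive the average of the ranks they span.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their std devs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    # The 1/n factors of the covariance and the variances cancel out.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# A nonlinear but strictly increasing relationship has perfectly aligned ranks.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))
```

Because only the ranks enter the calculation, any strictly increasing transformation of X or Y leaves ρ unchanged.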

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
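
As a rough sketch of this formula in plain Python (the function name `pearson_r` is hypothetical and this is not the report's own implementation), the example below also checks the invariance under shifting and rescaling mentioned above.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the variances cancel out.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r_original = pearson_r(x, y)

# Shift and rescale x (positive scale factor): r should be unchanged
# up to floating-point rounding.
x_rescaled = [10 * v + 3 for v in x]
r_rescaled = pearson_r(x_rescaled, y)
print(r_original, r_rescaled)
```

Note that a negative scale factor flips the sign of r while keeping its magnitude.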

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
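
The pair-counting definition above corresponds to the variant usually called tau-a. The following is a minimal sketch of it, with a hypothetical function name `kendall_tau_a`. It is not necessarily what the report computes: libraries such as SciPy default to the tie-adjusted tau-b variant, which gives different values when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs (tau-a)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # Pairs tied in x or y count toward neither total.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the middle two points).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This direct version compares every pair, so its cost grows quadratically with the number of observations.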

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
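
To make the bias correction concrete, here is a small plain-Python sketch of the corrected estimator as it is commonly stated: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and floored at 0, and the numbers of rows r and columns k are shrunk analogously before taking the square root. The function name `cramers_v_corrected` is hypothetical, and the report's own implementation may differ in detail.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floor at 0.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0  # a variable with a single category carries no association
    return sqrt(phi2_corr / denominator)


# Two labelings that determine each other exactly: V should be close to 1.
a = ["x"] * 50 + ["y"] * 50
b = ["p"] * 50 + ["q"] * 50
print(cramers_v_corrected(a, b))
```

For small samples the correction pulls weak associations to exactly 0, which the uncorrected estimator would report as small positive values.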

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.